<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
  <title>GaelSpell: English</title>
  <meta http-equiv="Content-Language" content="en">
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  <meta name="description" content="English summary of the GaelSpell home page">
  <meta name="keywords" content="GaelSpell, ispell, aspell, Irish language, spellchecker">
  <meta name="author" content="Kevin P. Scannell">
  <link rel="stylesheet" href="../kps.css" type="text/css">
</head>

<body>
<div class="content">
<h1>GaelSpell:<br>
English summary
</h1>

<h2>
<a href = "/index.html">Kevin P. Scannell</a>
</h2>

<hr>
<h2>Summary</h2>
<p>
This page is provided as an aid to package maintainers 
and others who may be unable to read the 
<a href = "index.html"><i>GaelSpell</i> home page</a>,
which is entirely in Irish.  It is in no sense a translation of 
the Irish page, which contains much more detailed descriptions.
</p>

<p>
For Linux users, we have packages which provide Irish language 
support for the most widely used spellcheckers in the Open 
Source community: <i>ispell-gaeilge</i> for Geoff Kuenning's
<a href = "http://fmg-www.cs.ucla.edu/geoff/ispell.html"><i>International Ispell</i></a>,
<i>aspell-gaeilge</i> for Kevin Atkinson's
<a href = "http://aspell.sourceforge.net/"><i>Aspell</i></a>,
and <i>hunspell-gaeilge</i> for the
<a href="http://www.openoffice.org/">OpenOffice.org</a> 
spellchecker.  The word lists are identical, just packaged differently.
</p>

<p>
Diarmaid Mac Mathúna has also repackaged the same underlying word
list for use on Windows machines; 
it is available (under the GPL) from
the GaelSpell site:
<a href="http://www.gaelspell.com/">www.gaelspell.com</a>.
</p>

<hr>
<h2><a name="Features">Features</a></h2>

<ul>
<li><b>Large Word List.</b>  There are around 330,000 words in the database.
By my estimate this is about five times larger than the word list in the new Irish
spellchecker released by Microsoft, although I cannot tell for sure 
because that product is closed-source.
The 
coverage is equivalent to a dictionary with around 26,000 headwords,
almost twice as big as a typical pocket dictionary (e.g. the Oxford or
the Collins Gem).
<li><b>Grammatical Completeness.</b>  I have written software which 
generates every inflected form of a dictionary headword when provided
with a limited amount of grammatical information.  For instance,
adding the word <i lang="ga">fuaimnigh</i> to the underlying database 
as a second conjugation verb adds 87 inflected forms 
to the word list (all verb endings plus lenition, eclipsis, 
the prefix "d'", etc.).
<li><b>Accuracy.</b>  The one absolute rule when building a spellchecker
is that there should be no misspelled words in the basic word lists.
Every word has been checked against print sources at least once.
The software which generates the inflected forms has been tested in various
ways, including with the shell script "igcheck", 
which checks a word list
for letter combinations which are illegal or "pre-standard" in Irish.
The other word lists I've seen contain anywhere from 10% to 40% English or 
misspelled Irish words.
<li><b>Frequent Updates.</b> I have provided major updates every six 
months or so since the initial release and plan to continue this for
the foreseeable future.   
Candidates for addition to the word list are harvested via
statistical methods as part of the
<a href="http://crubadan.org/"><i lang="ga">Crúbadán</i></a> web crawling project;
this is an effective way of keeping up with the latest
terminology.
I have also been adding words from the print dictionaries
published by <i lang="ga">An Gúm</i> and the resources available
from <a href="http://www.acmhainn.ie/">acmhainn.ie</a>.
<li><b>Dialect support (<i>ispell</i> only).</b>  There are three
different installation options included with the 
<i>ispell-gaeilge</i> package, described below under 
<a href="#Models">Alternate Models</a>.
<li><b>Phonetic support (<i>aspell</i> only).</b>  The file 
<i>gaeilge_phonet.dat</i> provides a complete "coarse" encoding 
of the pronunciation of Irish.  This allows <i>aspell</i> to make more
intelligent suggestions when it comes across a misspelled word.  For instance,
where <i>ispell</i> gives no suggestions for the pre-standard 
<i lang="ga">imfhiosach</i>,
<i>aspell</i> uses the phonetics file to encode this as "*M*S*K",
thereby recognizing and suggesting the correct spelling 
<i lang="ga">iomasach</i>.
</ul>
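
<p>
The coarse phonetic encoding mentioned in the last item can be illustrated with a
toy sketch in Python.  The rules below are assumptions chosen only to reproduce the
<i lang="ga">imfhiosach</i> example; they are not the contents of
<i>gaeilge_phonet.dat</i>, and the name <i>coarse_key</i> is hypothetical.
Runs of vowels collapse to "*", lenited "fh" is treated as silent, and "ch" maps to "K".
</p>

<pre>
```python
# Toy "coarse" phonetic key for Irish words.
# NOTE: these rules are illustrative assumptions, not the real
# gaeilge_phonet.dat rule set used by aspell.
VOWELS = set("aeiouáéíóú")

# Hypothetical multi-letter and single-letter rules, tried in order.
RULES = [
    ("fh", ""),   # lenited f is silent, so it contributes nothing
    ("ch", "K"),  # broad and slender ch are not distinguished
    ("c", "K"),
]


def coarse_key(word):
    """Return a coarse key in which every vowel run becomes a single '*'."""
    rest = word.lower()
    key = []
    while rest:
        if rest[0] in VOWELS:
            # Collapse consecutive vowels into one '*'.
            if not key or key[-1] != "*":
                key.append("*")
            rest = rest[1:]
            continue
        for pattern, code in RULES:
            if rest.startswith(pattern):
                if code:
                    key.append(code)
                rest = rest[len(pattern):]
                break
        else:
            # No rule matched: keep the consonant itself, uppercased.
            key.append(rest[0].upper())
            rest = rest[1:]
    return "".join(key)


# Both spellings from the text map to the same key.
print(coarse_key("imfhiosach"), coarse_key("iomasach"))  # *M*S*K *M*S*K
```
</pre>

<p>
Because the pre-standard and standard spellings share a key, a checker that indexes
its word list by such keys can offer the standard form as a suggestion.
</p>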

<hr>
<h2><a name="Models">Alternate Models</a></h2>

<p>
The default word list conforms strictly to standardized Irish spelling.
You can generate either a "literary" or "dialect" model
(<i>ispell</i> only) by changing the variable 
<i>INSTALLATION</i> at the top of the Makefile
to <i>gaeilgelit</i> or <i>gaeilgemor</i> and running "make" as usual.  
</p>
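
<p>
Packagers who script their builds may prefer to make this change programmatically.
The following is a minimal Python sketch, assuming the Makefile contains an
assignment line of the form "INSTALLATION=..."; the function name
<i>set_installation</i> and the sample Makefile text (including the value
"gaeilge") are hypothetical and are not taken from the package.
</p>

<pre>
```python
import re

# Model names taken from the text above.
ALTERNATE_MODELS = ("gaeilgelit", "gaeilgemor")

# Matches an assignment such as "INSTALLATION=x" or "INSTALLATION = x".
ASSIGNMENT = re.compile(r"^(INSTALLATION[ \t]*=[ \t]*).*$", re.MULTILINE)


def set_installation(makefile_text, model):
    """Return makefile_text with the INSTALLATION variable set to model."""
    if model not in ALTERNATE_MODELS:
        raise ValueError("unknown model: " + model)
    # Replace only the first assignment, keeping the original spacing.
    new_text, count = ASSIGNMENT.subn(
        lambda m: m.group(1) + model, makefile_text, count=1
    )
    if count == 0:
        raise ValueError("no INSTALLATION assignment found")
    return new_text


# Hypothetical Makefile excerpt; the real file may look different.
sample = "# hypothetical excerpt\nINSTALLATION=gaeilge\n"
print(set_installation(sample, "gaeilgelit"))
```
</pre>

<p>
After writing the result back to the Makefile, "make" is run as usual.
</p>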

<p>
The <i>gaeilgelit</i> model contains many obsolete or obscure 
words which, although spelled according to the standard, are probably best left out
of any good Irish spellchecker.
For instance, <i lang="ga">brúitíneach</i> (defined in Ó Dónaill as a stumpy or stuffy person)
is a likely misspelling of the much more common word
<i lang="ga">bruitíneach</i> (the measles).  Other typical 
"dangerous" word pairs include 
<i lang="ga">deirc</i> for <i lang="ga">déirc</i> and
<i lang="ga">múid</i> for <i lang="ga">muid</i>.
</p>
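
<p>
Pairs like these differ only in their length accents, so candidates can be found
mechanically.  The sketch below is one possible approach in Python, not the method
used to build the package; the function names <i>strip_fada</i> and
<i>accent_collisions</i> are hypothetical.  It groups words that become identical
once accents are removed.
</p>

<pre>
```python
import unicodedata
from collections import defaultdict


def strip_fada(word):
    """Remove acute accents (and any other combining marks) from a word."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_collisions(words):
    """Return sorted groups of words that differ only in their accents."""
    groups = defaultdict(set)
    for word in words:
        groups[strip_fada(word.lower())].add(word)
    # Keep only groups with at least two distinct spellings.
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


# Word pairs taken from the text above, plus one word with no near-twin.
words = ["brúitíneach", "bruitíneach", "deirc", "déirc",
         "múid", "muid", "fuaimnigh"]
for group in accent_collisions(words):
    print(" / ".join(group))
```
</pre>

<p>
Each group printed is only a candidate; deciding which member is too obscure for
the standard word list still requires a human judgement.
</p>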
<p>
The <i>gaeilgemor</i> model, on the other hand, contains
non-standard or dialect spellings (alongside the standard spellings)
and accepts non-standard inflections of verbs.   This greatly reduces
its effectiveness as a spellchecking tool; indeed, anyone who uses
non-standard forms so frequently that he or she finds the standard model 
inadequate will likely disagree with the very concept of an Irish
spellchecker in the first place!
</p>

<p>
With all this in mind,
<b>I strongly urge installers to make the standard model 
the default on their systems</b>.
</p>

<hr>
<a rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/"><img alt="Creative Commons License" style="border-width:0" src="http://i.creativecommons.org/l/by-sa/3.0/80x15.png" /></a><br />This <span xmlns:dct="http://purl.org/dc/terms/" href="http://purl.org/dc/dcmitype/Text" rel="dct:type">work</span> is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/">Creative Commons Attribution-ShareAlike 3.0 Unported License</a>.
</div>
<div class="navigation">
<a href = "index.html">Home</a><br>
English<br>
<a href = "sios.html">Download</a><br>
<a href = "cuidiu.html">Contributing</a><br>
<a href = "sonrai.html">Change Log</a><br>
<a href = "/nlp.html" hreflang="ga">Projects</a><br>
</div>
</body>
</html>
